Summary. In this paper we propose a multilevel model specification with time series components
for the analysis of prices of artworks sold at auctions. Since auction data do not constitute a panel
or a time series but are composed of repeated cross-sections, they require a specification with items
at the first level nested in time points. An original feature of our approach is the derivation of full 
maximum likelihood estimators through the E-M algorithm. The data analysed come from the first 
database of ethnic artworks sold in the most important auctions worldwide. The results show that the 
new specification improves considerably over existing proposals both in terms of fit and prediction. 
1. Introduction



Nowadays, artwork items are considered investment assets similarly to stocks, bonds and real
estate. For this reason, in the recent past, the analysis of this new market segment was performed
by resorting to tools for the analysis of financial markets. However, such tools miss some essential
aspects of the art market. Indeed, contrary to stocks, which are exchanged a high number of times in
each instant, artworks are one-off pieces of their kind, hardly comparable with each other, and they
pass through the market only a handful of times (usually only once). A further substantial difference
with respect to financial assets is that works of art provide aesthetic pleasure and social status to
their owners, in addition to mere monetary returns (Goetzmann, 1993). Moreover, there are considerable
transaction costs and, last but not least, there are no good publicly available databases on this
segment. Hence, the study of the art market requires new tools and renewed research efforts.

One of the most important problems in the analysis of art markets is the study of price
indexes for artwork items. In the Art Economics literature several proposals have been discussed,
especially for paintings. Among the most important contributions we mention Sotheby's Art
Index (and similar others), the average painting methodology (Stein, 1977), the representative painting
method (Candela and Scorcu, 1997), the repeated sales regression (Goetzmann, 1993) and the hedonic
regression, also called the grey painting method. The hedonic regression model is the most used
approach for modelling art prices; the idea is due to Rosen (1974), whereas developments and
applications can be found in Chanel (1995); Ginsburgh and Jeanfils (1995); Agnello and Pierce (1996);
Chanel et al. (1996); Collins et al. (2009); Locatelli Biey and Zanola (2005). The method assumes
that the price of an artwork depends both on the market trend and on certain object characteristics. 
Such dependence is modelled through a fixed effect regression. In particular, the estimated regres- 
sion coefficients are interpreted as the price of each feature, the so-called shadow price, assumed to 
be constant over time. Hence, it is possible to predict the price of a given object by summing the 
prices of its features. Also, a time-dependent intercept can represent the value of the grey painting
in that period, that is, the value of an artwork created by a standard artist, through standard
techniques, with standard dimensions, etc. (Candela and Scorcu, 2004; Locatelli Biey and Zanola,
2005). The final market price index is built from the prices of the grey painting in different periods.
The hedonic regression model has the advantage of solving the problem of artwork heterogeneity by
explaining prices through object features; also, it makes it possible to derive a price index by
neutralizing the effect of quality. Nevertheless, this method presents several drawbacks. First of all, it is difficult to
account for all the relevant features that determine the price of an object, so that only a part of the 
price is explained. Moreover, most of the object features are categorical, such as, for example, the 
artist's name that in Western art strongly affects the price of artworks. Therefore, the regression 
equation will contain many dummy variables and, consequently, a high number of parameters to be 
estimated, so that the resulting models are not parsimonious. Most importantly, it is not possible 
to forecast prices as the time dynamics is not modelled explicitly. In fact the price index relies only 
on the estimated coefficients of time-dependent dummy variables. 

In order to overcome the limits of the hedonic regression model, we propose a multilevel
approach. Multilevel data (Goldstein, 2010; Laird and Ware, 1982; Raudenbush and Bryk, 2002)
consist of units of analysis of different types, one hierarchically clustered within the other. At the
lowest level (level-1 observations) such units can be described by some variables; furthermore, they
are also grouped into larger units (higher level observations), which in turn could be described by
other variables. The general specification of multilevel models (Skrondal and Rabe-Hesketh, 2004)
allows a large variety of applications. In particular, repeated measures data can be seen as a
specific case of multilevel data with occasions i at level 1 and units j at level 2 (Van der Leeden, 1998;
Maas and Snijders, 2003). The dependence among level-1 errors that characterizes panel data can be
handled by including correlation structures at level 1 (Goldstein, 2010). For instance, Jones (1993)
and Vonesh and Chinchilli (1997) model the residual errors through first-order autoregressive or
autoregressive moving-average (ARMA) processes. Moreover, it is possible to allow heteroscedastic
within-group errors through variance functions (Davidian and Giltinan, 1995). This flexibility
in the specification of covariance structures represents an important feature of linear mixed-effect 
models for longitudinal data. In all these cases, any time dependence is modelled at the first level. 

Since auction data do not constitute a proper panel, the multilevel approach for longitudinal
data described above cannot be applied. Indeed, auction data have a structure similar to that
of repeated cross-sectional surveys. The main aim of this work is to propose a multilevel model
specification that is particularly suitable for handling prices of artworks sold at auctions over time.
Repeated cross-sectional survey data consist of observations on individual survey respondents drawn
from the same context (e.g. the same country) at many different time-points; therefore, they can be
clustered in time-points (Firebaugh, 1997) so that, even if it is not possible to follow specific
individuals over time, they make it possible to capture social changes. DiPrete and Grusky (1990) were
the first to adopt a multilevel framework to analyze repeated cross-sectional data. They called their
model the single-context multilevel model, as opposed to the traditional multiple-contexts model. The
substantial difference with traditional models is the serial correlation among level-2 units
(time-points). The authors took this case into account by deriving a generalized least-squares
estimator. A similar idea has been considered by Browne and Goldstein (2010), but for spatial
correlations and in a Bayesian framework. In their work, the independence assumption among level-2
disturbances is relaxed and the correlation between pairs of clusters is modelled through an explicit
function of the distance between them. However, to our knowledge, such approaches are not implemented
in any software handling multilevel models. Moreover, a full maximum likelihood approach for this
specification has not been considered. For these reasons, despite the wide potential interest,
multilevel models for this kind of data are poorly developed and seldom applied.

In this paper we aim to fill these gaps in several respects. We derive full maximum likelihood
estimators with known desirable properties for a multilevel specification similar to that presented
in DiPrete and Grusky (1990). We treat auction data as repeated cross-sections by taking individuals
(in our case artwork items) as level-1 units and time-points as level-2 units. Hence, the
price dynamics over time are modelled at the second level by means of a first-order autoregressive
structure between random effects, as required by the case under investigation. The proposal
combines the flexibility of mixed effect models with the predictive performance of time
series components. This specification turns out to be a natural and more convenient choice than the
hedonic regression for modelling artwork prices. The overall result is a parsimonious yet powerful
specification that can also prove a useful tool to forecast future values of the price. We obtain
model estimates through an EM iterative algorithm and derive robust standard errors by means of a
bootstrap scheme. The work has been motivated by the analysis of the first world database of Tribal
art prices. This database has been built by a team of researchers of the University of Bologna,
Faculty of Economics - Rimini, in conjunction with other institutions, and contains information on
over 20000 artwork items sold by the most important auction houses from 1998 onwards.

The paper is organized as follows. The database of Tribal art prices is described in the next
Section. In Section 3 we present the multilevel specification for Tribal art data and compare it
with the traditional hedonic approach. Section 4 contains the main theoretical contribution of this
paper, that is, the extension of the multilevel model to deal with the time dependence at the second
level through a maximum likelihood approach. Section 5 describes the results of the new model
fitted on Tribal art data and compares them with those of the classic versions. Also, the predictive
performance of the three models is assessed. Finally, conclusions and a discussion of future research
are provided in the final Section.

2. The first database of Tribal artworks

The important problem of the construction of price indexes for art markets requires data on sales 
of artworks. At present, the only available information in this area comes from auction exchanges. 
Nowadays, there are private companies (e.g. Artnet.com, Artinfo.com, Arsvalue.com, Artprice.com) 
that publish and sell information about auctions and price indexes, as well as art evaluations and 
other services. However, most of these companies deal with Western art. In this scenario, for a long 
time, there was no database on Ethnic art. In recent years, the turnover of the Tribal art
market (see Figure 1, left) attracted the interest of investors and economists. The first database on
Ethnic artworks was created in 2006 from the agreement of four institutions: the Department of
Economics of the University of the Italian Switzerland, the Museum of the Extraeuropean cultures
in Lugano, the Museo degli Sguardi in Rimini, and the Faculty of Economics of the University of
Bologna, campus of Rimini. For each object, 37 variables are recorded from the paper catalogues
released by the auction houses before the auctions; such variables include physical, historical and
market characteristics. After the auction, the information on the selling price is added to the record.

Fig. 1. (left): yearly turnover (euro) in logarithmic scale (base 10) of the Tribal art market; (right): boxplots of
prices by year. The number of items sold in a given year is reported inside the boxes.



In Figure 1 (right) we report the boxplots of logged prices aggregated by year (inside the boxes,
the total number of items sold in each year). The plot provides a visual description of the structure
of the dataset: a different group of artworks is sold each year; e.g. 1322 items were auctioned in
1998, 1347 objects different from the first set were sold in 1999, and so on. It is clear that Tribal art
data do not constitute either a panel or a time series but have a structure like that of repeated
cross-sectional surveys. Moreover, the medians (black lines) give an idea of the trend of prices over time.
The year 2003 was the least satisfactory one but also the one with the highest number of artworks sold.
After this period, the market experienced a gradual increase in prices and overall turnover. The
fall in turnover in 2009, instead, is likely due to the decrease in the number of auctioned items.
However, although the object supply has become scarcer in recent years (compare the low number
of sold items in the boxplots and the quite high percentage of sales of Figure 1, right), the turnover
is not suffering the same decline, owing to higher prices. Overall, the positive trend gives an idea of
the great potential of the Tribal art market.

In this paper we study the dependence of prices of artworks on available characteristics over the
time span 1998-2011, for a total of 14206 items. All hammer prices have been deflated through the
HICP (Harmonized Index of Consumer Prices) and converted to euros. The characteristics of items
used as explanatory variables are listed in Table 1. Also, based on theoretical arguments, we include
the interactions of the pairs "illustration type" - "width of the illustration" and "auction house" -
"venue". Indeed, the Cramer's V pairwise association index for such variables is quite high (0.71
and 0.69 respectively). For further details and a descriptive analysis of the dataset see Candela et al.
(2012) and Modugno and Giannerini (2008).
Table 1. Covariates of the models.

Group       Variable                        Categories
----------  ------------------------------  ------------------------------------------------------
Physical    Type of object                  Furniture; Sticks; Masks; Religious objects; Ornaments;
                                            Sculptures; Musical instruments; Tools; Clothing;
                                            Textiles; Weapons; Jewels
            Material                        Ivory; Vegetable fibre; Wood; Metal; Gold; Stone;
                                            Precious stone; Terracotta, ceramic; Silver;
                                            Textile and hides; Seashell; Bone, horn; Not indicated
            Patina                          Not indicated; Pejorative; Present; Appreciative
Historical  Region                          Central, Southern, Western, Eastern and Northern Africa;
                                            Australia; Indonesia; Melanesia; Polynesia; Mesoamerica;
                                            Northern and Southern America; Micronesia; Far Eastern;
                                            Indian Region; Southeastern Asia; Middle East
            Illustration on the catalogue   Absent; Black/white; Coloured
            Illustration width              Absent; Miscellaneous; Quarter page; Half page;
                                            Full page; More than one; Cover
            Description                     Absent; Short visual; Visual; Broad visual; Critical;
                                            Broad critical
            Specialized bibliography        Yes; No
            Comparative bibliography        Yes; No
            Exhibition                      Yes; No
            Historicization                 Absent; Museum certification;
                                            Relevant museum certification; Simple certification
Market      Venue                           New York; Paris
            Auction house                   Sotheby's; Christie's
3. A multilevel model for Tribal art prices

Among the existing proposals for modelling art prices, the hedonic regression model is suitable for
Tribal art data. Indeed, this approach seems more suitable for Ethnic art than for Western art data.
One reason for this is that Tribal art is considered an anonymous art, since ethnic objects are not
characterized by their artist's name (unknown) but by their ethnic provenance. Since the number
of ethnic groups is generally smaller than the number of artists' names, the hedonic model for
Tribal art results in fewer dummy variables than those applied to other art segments. Moreover, the
number of iconographic subjects and materials is more limited. Therefore, some of the drawbacks
of the hedonic regression method are less pronounced when applied to Tribal data. The regression
model for the price of artworks corresponding to the hedonic regression specification, which we call
"FE" (standing for Fixed Effects), can be expressed as

$$\log_{10}(y_{it}) = \beta_{0t} + \mathbf{x}_{it}^{\top}\boldsymbol{\beta} + \epsilon_{it}, \qquad \epsilon_{it} \mid \mathbf{x}_{it} \sim \mathrm{NID}(0, \sigma^2) \tag{1}$$

where $y_{it}$ is the observed price for the time-point $t = 1, \ldots, T$ and the item $i = 1, \ldots, n_t$, and
$\mathbf{x}_{it}$ is the corresponding set of covariates listed in Table 1. $\beta_{0t}$ represents the mean price at
time-point $t$. In our specific case, we have chosen to take the semesters as time-points rather than the auction
dates, mainly for three reasons:

(a) the auctions are organized in two sessions, one during the winter and one during the summer,
and each session contains two to four auctions quite close in time; the concentration in time
and space makes it possible to exploit economies of scale;

(b) in general, the stakeholders look at the performance of the previous semester; 

(c) auction dates are not equally spaced in time and this feature is important for modelling time 
dependence. 

In the dataset, the number of semesters is $T = 27$, and $n_t$, the number of items sold in
semester $t$, varies between 80 (semester 2010-2) and 915 (semester 1998-2); the overall sample size of sold
items ($n = \sum_{t=1}^{T} n_t$) is 14206.
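As an illustration of the FE specification, the following minimal sketch fits a regression with one dummy per semester by ordinary least squares on simulated data. All numbers (four semesters, one covariate, the intercepts and the noise level) are hypothetical and are not taken from the Tribal art database; the index construction, relative to the first semester, is likewise only one possible convention.

```python
import numpy as np

rng = np.random.default_rng(42)
T, n_t = 4, 500                              # hypothetical: 4 semesters, 500 items each
true_b0t = np.array([2.0, 2.1, 1.9, 2.2])    # hypothetical semester intercepts (log10 scale)
true_beta = 0.3                              # hypothetical shadow price of one covariate

semester = np.repeat(np.arange(T), n_t)
x = rng.normal(size=T * n_t)
log_price = true_b0t[semester] + true_beta * x + rng.normal(0, 0.1, T * n_t)

# Design matrix: one dummy column per semester (no global intercept) plus x.
D = (semester[:, None] == np.arange(T)[None, :]).astype(float)
design = np.column_stack([D, x])
coef, *_ = np.linalg.lstsq(design, log_price, rcond=None)
b0t_hat, beta_hat = coef[:T], coef[T]

# Price index relative to the first semester, back on the price scale.
index = 100 * 10 ** (b0t_hat - b0t_hat[0])
```

The number of estimated time parameters grows with T, which is the lack of parsimony discussed in the text.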

The FE model fails to capture some essential features of the price dynamics. First, such a model
is not parsimonious in that both time-effects and categorical covariates are included as dummy
variables. Also, the time dummy approach does not allow the dynamics of prices over time to be
modelled directly, and all the effects are assumed constant over time (Collins et al., 2009). Furthermore,
potential sources of heterogeneity and heteroscedasticity cannot be accounted for by the hedonic
regression model. Last but not least, Tribal art data possess a hierarchical structure which is
completely disregarded. For these reasons we propose a multilevel specification which is capable
of addressing the aforementioned issues. The task requires a suitable modification of the classic
multilevel model. As already highlighted, since we observe different artworks sold at every auction,
Tribal art data do not constitute either a panel or a time series. Rather, they can be thought to
have a two-level structure in that items, level-1 units, are grouped in time points, level-2 units.
Hence, the idea is to exploit the multilevel model to explain heterogeneity of prices among time
points.

The two-level model, which we call "RE" (standing for Random Effects), has Eq. (1) as the level-1
model, whereas the level-2 model is

$$\beta_{0t} = \beta_0 + u_t, \qquad u_t \mid \mathbf{X}_t \sim \mathrm{NID}(0, \sigma_u^2), \qquad u_t \perp \epsilon_{it} \tag{2}$$
where $\beta_0$ is the overall mean price and $u_t$ is a random intercept for the semester $t$. Note that $y_{it}$ and
$y_{i(t+1)}$ do not represent the price of item $i$ observed at successive time points; rather, $y_{it}$ indicates
the price of the $i$-th object observed at time-point $t$, whereas $y_{i(t+1)}$ is the price of the $i$-th object
at time-point $t + 1$. The two objects are physically different. One could even specify the temporal
dependence of the subscript $i$ by changing it to $i_t$. However, as this would lead to an unnecessary
complication in the notation, we have chosen the present form.

In the first and in the second column of Table 2 we present the results of the hedonic regression
fit (FE) and the multilevel model (RE) respectively, both of which have been fitted through the
maximum likelihood method to allow comparisons. The current specification has been driven
by both theoretical (Art Economics) and empirical arguments. In practice, all the parameters
turn out to be significant. Also, notice the magnitude of the effects (with interaction) related to market
characteristics such as auction house, venue and illustration.

The two models produce similar results. In particular, besides the estimated coefficients, also
the time effects (Semester effect) are very close, although in the FE model these values are
estimated coefficients $\hat\beta_{0t}$ whereas in the RE model they are Best Linear Unbiased Predictions
$\hat\beta_0 + \hat u_t$ (Searle et al., 1992). This is due to the very high shrinkage factor (Goldstein, 2010).
In fact, if we consider the cluster means of model (1):

$$\bar{y}_t = \beta_{0t} + \bar{\mathbf{x}}_t^{\top}\boldsymbol{\beta} + \bar{\epsilon}_t, \tag{3}$$

we have that the estimates of time-specific intercepts correspond to the group means

$$\hat\beta_{0t} = \bar{y}_t - \bar{\mathbf{x}}_t^{\top}\hat{\boldsymbol{\beta}}. \tag{4}$$

On the other hand, the group means for the RE model are obtained as

$$\hat\beta_{0t} = \hat\beta_0 + \hat u_t = \hat\beta_0 + \lambda_t\left(\bar{y}_t - \hat\beta_0 - \bar{\mathbf{x}}_t^{\top}\hat{\boldsymbol{\beta}}\right), \tag{5}$$

where

$$\lambda_t = \frac{\hat\sigma_u^2}{\hat\sigma_u^2 + \hat\sigma^2 / n_t} \tag{6}$$

is the shrinkage factor that can be interpreted as the estimated reliability of the mean raw residual
as a predictor of $u_t$. Indeed, the shrinkage factor takes values in [0, 1] and pulls the group means
towards the overall mean by an amount depending both on $n_t$ and on the variance components.
Since, in our case, the group sample sizes are big as compared to the variance components, the
shrinkage factor is close to one for each $t$. Therefore, the time-effects are almost coincident for the
two models because each group-specific mean dominates over the population mean.
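To give a feeling for the magnitudes involved, here is a small numerical sketch of the shrinkage factor in its usual random-intercept form, $\lambda_t = \sigma_u^2 / (\sigma_u^2 + \sigma^2/n_t)$. The between-semester variance 0.029 and the ICC of 14.5% are the values reported in the paper for the RE model; the level-1 variance is not quoted directly, so the value used here is the one implied by those two figures and should be read as an assumption.

```python
def shrinkage(n_t, s2_u, s2):
    """Shrinkage factor lambda_t = s2_u / (s2_u + s2 / n_t)."""
    return s2_u / (s2_u + s2 / n_t)

s2_u = 0.029                       # between-semester variance reported for the RE model
s2 = s2_u * (1 - 0.145) / 0.145    # level-1 variance implied by the reported ICC (assumption)

lam_small = shrinkage(80, s2_u, s2)    # smallest semester (80 items)
lam_large = shrinkage(915, s2_u, s2)   # largest semester (915 items)
```

Under these assumptions even the smallest semester has a factor above 0.9, which is why the RE predictions stay close to the FE semester intercepts.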

Besides the similar parameter estimates, the multilevel model includes a further variability
component, the between-group variance, $\sigma_u^2$. The significance of its estimate has been positively
assessed through a likelihood ratio test between this model and its restricted version (Eq. (1)
with $\beta_{0t} = \beta_0$); since the null hypothesis of zero variance is on the boundary of the feasible
parameter space, we used half of the p-value obtained from the tables of the chi-squared distribution
(Self and Liang, 1987). The proportion of the total variability of prices explained by the
variability among semesters is $100 \cdot \sigma_u^2/(\sigma_u^2 + \sigma^2) = 14.5\%$, which, in a two-level random-intercept
model, corresponds to the intra-class correlation (ICC), the correlation between two observations in
the same semester. The existence of a non-zero ICC reveals the inadequacy of traditional modelling
frameworks (Goldstein, 2010).
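The two quantities used in this paragraph can be sketched in a few lines. The function names are hypothetical; the halved p-value uses the identity that the upper tail of a chi-squared variable with one degree of freedom equals erfc(sqrt(x/2)), and the level-1 variance 0.171 is the value implied by the reported ICC rather than a figure quoted in the text.

```python
import math

def icc(s2_u, s2):
    """Intra-class correlation of a two-level random-intercept model."""
    return s2_u / (s2_u + s2)

def boundary_lrt_pvalue(lr_stat):
    """Halved chi-squared(1) tail probability for a variance tested on the boundary."""
    chi2_1_tail = math.erfc(math.sqrt(lr_stat / 2.0))
    return 0.5 * chi2_1_tail

icc_re = icc(0.029, 0.171)             # variance components as discussed above
p_half = boundary_lrt_pvalue(3.841)    # hypothetical likelihood ratio statistic
```

A statistic that would give a p-value of about 0.05 from the chi-squared table therefore corresponds to about 0.025 after halving.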
Fig. 2. Global (left) and partial (right) autocorrelation functions of level-2 residuals of the RE model.



As concerns the diagnostic analysis, the Shapiro-Wilk test points to a deviation from normality
in level-1 residuals whereas it does not reject the assumption of normality for level-2 residuals.
Given the non-normality at level 1, in order to test the assumption of homogeneity of the variance
across clusters, we use a non-parametric, rank-based version of the homogeneity test of Levene (1960)
(Kruskal and Wallis, 1952). The results indicate that level-1 variances change over
time. To cope with these problems, we have computed robust standard errors for the estimates
through a modified version of the wild bootstrap procedure, described in subsection 3.1. Such a
scheme is robust with respect to heteroscedastic and non-Gaussian errors.

In order to assess the assumption that the error process $u_t$ is a white noise (conditionally on the
covariates), we have computed the global and partial autocorrelation functions of level-2 residuals
(Figure 2). Clearly, the correlograms point to an autoregressive-like structure, similar to that of
an AR(1) process. In summary, the RE model (1) and (2) produces results very similar to those
of the traditional FE model (1) in terms of estimates and residuals, but with greater parsimony.
In addition, the multilevel model is able to explain a proportion of variability of the price through
the variability among semesters. The assumptions of normality and homogeneity of variance across
groups for level-1 errors of both models are not satisfied, so that we have used robust bootstrap
standard errors. On the other hand, the predicted random effects are normally distributed with zero
mean, but they are not independent for different groups as they show a peculiar autocorrelation
structure. Improving the classical multilevel model to deal with the latter issue requires relaxing the
assumption of independence among random effects. Since in the analysis of Tribal art data these
represent time effects, the inclusion of such correlations implies treating them as a time series. As
mentioned above, the correlograms of the residuals suggest the specification of an AR(1) model.
Section 4 is devoted to the specification and the estimation of such a model.
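The kind of correlogram check described above can be sketched as follows. The series is simulated from an AR(1) process with a hypothetical coefficient of 0.8 and is far longer than the 27 semesters available in the application, purely so that the pattern is visible; the function name is an illustrative choice.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of a series."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)])

rng = np.random.default_rng(1)
rho, n = 0.8, 5000                  # hypothetical AR(1) coefficient and length
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + rng.normal()

acf = sample_acf(u, 5)
# Lag-2 partial autocorrelation from the first two autocorrelations.
pacf2 = (acf[2] - acf[1] ** 2) / (1 - acf[1] ** 2)
```

For an AR(1) process the autocorrelations decay geometrically while the partial autocorrelation is close to zero beyond lag 1, which is the signature read off Figure 2.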



3.1. Robust standard errors through the wild bootstrap procedure 

The wild bootstrap was developed by Liu (1988) following suggestions in Wu (1986) and Beran
(1986). Further evidence and refinements for classic regression models are provided in Flachaire
(2004) and Davidson and Flachaire (2008). Here, we adopt the wild bootstrap procedure adapted
to the case of hierarchical data in Modugno and Giannerini (2013).



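Consider the random-intercept model for the $(n_t \times 1)$ response of the generic group $t$:

$$\mathbf{y}_t = \mathbf{1}_{n_t}\beta_0 + \mathbf{X}_t\boldsymbol{\beta} + \mathbf{v}_t,$$

where

$$\mathbf{v}_t = \mathbf{1}_{n_t}u_t + \boldsymbol{\epsilon}_t,$$

for all $t = 1, \ldots, T$. The disturbances are assumed to be mutually independent and to have zero
expectation, but they are allowed to be heteroscedastic. Moreover, the covariates are assumed to
be strictly exogenous.

Denoting with $\mathbf{H}_t = \mathbf{X}_t(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}_t^{\top}$ the orthogonal projection matrix corresponding to the design
matrix $\mathbf{X}_t$, we replace the residual vector $\hat{\mathbf{v}}_t = \mathbf{y}_t - \mathbf{1}_{n_t}\hat\beta_0 - \mathbf{X}_t\hat{\boldsymbol{\beta}}$ by the vector

$$\tilde{\mathbf{v}}_t = \mathrm{diag}(\mathbf{I}_{n_t} - \mathbf{H}_t) \circ \hat{\mathbf{v}}_t,$$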

where the operator "$\circ$" denotes the Hadamard (or entrywise) product. Then, the bootstrap
procedure used is as follows:

(a) draw independently $T$ values, $w_t$, for $t = 1, \ldots, T$, from the following two-point auxiliary
distribution:

$$w_t = \begin{cases} -(\sqrt{5} - 1)/2 & \text{with probability } p = (\sqrt{5} + 1)/(2\sqrt{5}), \\ (\sqrt{5} + 1)/2 & \text{with probability } 1 - p, \end{cases}$$

with zero mean and unit variance;

(b) generate the bootstrap samples as

$$\mathbf{y}_t^{*} = \mathbf{1}_{n_t}\hat\beta_0 + \mathbf{X}_t\hat{\boldsymbol{\beta}} + \mathbf{v}_t^{*},$$

where $\mathbf{v}_t^{*} = \tilde{\mathbf{v}}_t w_t$;

(c) compute estimates on the bootstrap sample $\mathbf{y}^{*}$;

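(d) repeat steps (a)-(c) $B$ times and compute bootstrap standard errors as

$$\widehat{\mathrm{se}}(\hat{\boldsymbol{\theta}}) = \sqrt{\frac{1}{B - 1}\sum_{b=1}^{B}\left(\hat{\boldsymbol{\theta}}_b^{*} - \bar{\boldsymbol{\theta}}^{*}\right)^{2}},$$

where $\hat{\boldsymbol{\theta}}_b^{*}$ is the vector of the ML estimates computed on the $b$-th bootstrap sample and
$\bar{\boldsymbol{\theta}}^{*}$ is their average over the $B$ replications (squares and square root taken elementwise).

Modugno and Giannerini (2013) show that this version of the wild bootstrap behaves well in case
of heteroscedasticity and non-normality, and, most of all, outperforms the other bootstrap schemes
used for multilevel data.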
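The two-point auxiliary distribution of step (a) can be checked directly. The sketch below computes its mean and variance exactly from the two support points and draws one weight per semester; the residual values are hypothetical placeholders, and the function name is an illustrative choice rather than part of any published code.

```python
import math
import random

SQRT5 = math.sqrt(5.0)
W_LOW, W_HIGH = -(SQRT5 - 1) / 2, (SQRT5 + 1) / 2
P_LOW = (SQRT5 + 1) / (2 * SQRT5)

def draw_weights(T, rng):
    """Draw one two-point weight per level-2 unit (semester)."""
    return [W_LOW if rng.random() < P_LOW else W_HIGH for _ in range(T)]

# Exact moments of the auxiliary distribution.
mean_w = P_LOW * W_LOW + (1 - P_LOW) * W_HIGH
var_w = P_LOW * W_LOW**2 + (1 - P_LOW) * W_HIGH**2 - mean_w**2

rng = random.Random(0)
weights = draw_weights(27, rng)

# Every residual of semester t is multiplied by the same weight w_t,
# so the within-semester dependence of the residuals is preserved.
residuals = [[0.1, -0.2], [0.05, 0.3]]       # hypothetical residuals for two semesters
boot = [[w * v for v in vt] for w, vt in zip(weights, residuals)]
```

Drawing a single weight per group, rather than one per observation, is what distinguishes this scheme from the wild bootstrap for ordinary regression.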

4. A multilevel model with autoregressive components 

In this section we propose an extension of the multilevel model proposed in Section 3, which consists
in relaxing the assumption of independence among random effects and treating them as a time
series at the second level. The section has two subsections: the first one describes the specification
of the model whereas the second subsection presents the implementation of the estimators in the
maximum likelihood framework.
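4.1. Model specification

Consider a random intercept model with $k$ level-1 covariates:

$$y_{it} = \beta_{0t} + \mathbf{x}_{it}^{\top}\boldsymbol{\beta} + \epsilon_{it}, \qquad \epsilon_{it} \mid \mathbf{x}_{it} \sim \mathrm{NID}(0, \sigma^2) \tag{8}$$

for $i = 1, \ldots, n_t$ and $t = 1, \ldots, T$. The slopes $\boldsymbol{\beta}$ are fixed; the intercepts $\beta_{0t}$ are group-specific and
random, and they are modelled as

$$\beta_{0t} = \beta_0 + u_t, \tag{9}$$

where $u_t$ represents the deviation of the group-specific intercept $\beta_{0t}$ from the overall mean, $\beta_0$.
The usual assumption of independence for the random effects in (2) is relaxed by assuming an
autoregressive process of order 1 for level-2 errors:

$$u_t = \rho u_{t-1} + \eta_t, \qquad \eta_t \mid \mathbf{X}_t \sim \mathrm{NID}(0, \sigma_\eta^2), \tag{10}$$

with $|\rho| < 1$ (which guarantees stationarity), $\eta_t \perp u_s$ and $\eta_t \perp \epsilon_{is}$ for all $s < t$ and for all $i$.
Under these assumptions the dependent variable has the following distribution

$$y_{it} \sim N\left(\beta_0 + \mathbf{x}_{it}^{\top}\boldsymbol{\beta},\; \sigma^2 + \phi_0\right), \tag{11}$$

with

$$\phi_0 = \mathrm{Var}(u_t) = \frac{\sigma_\eta^2}{1 - \rho^2}. \tag{12}$$

In matrix form, the composite model for the whole response vector is

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{b} + \boldsymbol{\epsilon}, \tag{13}$$

where $\mathbf{Z}$ is known as the random effect design matrix and $\mathbf{b} = (\beta_{01}, \beta_{02}, \ldots, \beta_{0T})^{\top}$ is the
corresponding vector of random intercepts with covariance matrix

$$\boldsymbol{\Gamma} = \phi_0 \begin{pmatrix} 1 & \rho & \cdots & \rho^{T-1} \\ \rho & 1 & \cdots & \rho^{T-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \cdots & 1 \end{pmatrix}. \tag{14}$$

Thus, we have:

$$\mathbf{y} \sim N\left(\mathbf{1}_n\beta_0 + \mathbf{X}\boldsymbol{\beta},\; \mathbf{Z}\boldsymbol{\Gamma}\mathbf{Z}^{\top} + \sigma^2\mathbf{I}_n\right). \tag{15}$$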
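The implied covariance structure of the response can be made concrete with a toy example. The sketch below builds the AR(1) covariance matrix of the random intercepts and the marginal covariance of y for three semesters with two items each. The sizes are hypothetical; the values of rho and of the innovation variance are the estimates reported later in the paper, while the level-1 variance is an assumed value.

```python
import numpy as np

def ar1_cov(T, rho, s2_eta):
    """Covariance matrix of T consecutive values of a stationary AR(1) process."""
    idx = np.arange(T)
    return s2_eta / (1 - rho**2) * rho ** np.abs(idx[:, None] - idx[None, :])

T, n_t = 3, 2                          # hypothetical: 3 semesters, 2 items each
rho, s2_eta, s2 = 0.843, 0.010, 0.171  # s2 is an assumed level-1 variance

Gamma = ar1_cov(T, rho, s2_eta)
Z = np.kron(np.eye(T), np.ones((n_t, 1)))      # item-to-semester indicator matrix
Omega = Z @ Gamma @ Z.T + s2 * np.eye(T * n_t)  # marginal covariance of y
phi0 = s2_eta / (1 - rho**2)
```

Two items sold in the same semester share the covariance phi0, while items sold k semesters apart have covariance rho^k times phi0, so prices remain correlated across neighbouring semesters even though no item is observed twice.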



4.2. Model estimation 

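Model estimation is performed by using the full maximum likelihood estimation method through
the E-M algorithm, since the random effects are unobserved.

The set of parameters of the multilevel model with AR(1) random effects to be estimated is
$\boldsymbol{\theta} = \{\beta_0, \boldsymbol{\beta}, \sigma^2, \rho, \sigma_\eta^2\}$. The log-likelihood function associated with the response vector $\mathbf{y}$ is given by

$$\ell(\boldsymbol{\theta}; \mathbf{y}) = \ln L(\boldsymbol{\theta}; \mathbf{y}) = -\frac{n}{2}\ln(2\pi) - \frac{1}{2}\ln|\boldsymbol{\Omega}| - \frac{1}{2}(\mathbf{y} - \tilde{\mathbf{X}}\tilde{\boldsymbol{\beta}})^{\top}\boldsymbol{\Omega}^{-1}(\mathbf{y} - \tilde{\mathbf{X}}\tilde{\boldsymbol{\beta}}), \tag{16}$$

where

$$\boldsymbol{\Omega} = \mathbf{Z}\boldsymbol{\Gamma}\mathbf{Z}^{\top} + \sigma^2\mathbf{I}_n \tag{17}$$

is the covariance matrix of $\mathbf{y}$, and

$$\tilde{\mathbf{X}} = [\, \mathbf{1}_n \;\; \mathbf{X} \,] \quad \text{and} \quad \tilde{\boldsymbol{\beta}} = [\, \beta_0 \;\; \boldsymbol{\beta}^{\top} \,]^{\top} \tag{18}$$

are the design matrix and the coefficient vector including the intercept, respectively.

To simplify the notation, we separate the set of parameters of the model into two subsets:
$\boldsymbol{\theta} = \{\boldsymbol{\theta}_1, \boldsymbol{\theta}_2\}$, where the subset $\boldsymbol{\theta}_1 = \{\boldsymbol{\beta}, \sigma^2\}$ includes the level-1 parameters, and
$\boldsymbol{\theta}_2 = \{\beta_0, \rho, \sigma_\eta^2\}$ is the set of level-2 parameters.

The complete log-likelihood of the observed and unobserved data can be expressed as the sum
of two separate components

$$\ell(\boldsymbol{\theta}; \mathbf{y}, \mathbf{b}) = \ln L(\boldsymbol{\theta}; \mathbf{y}, \mathbf{b}) = \ell_1(\boldsymbol{\theta}_1) + \ell_2(\boldsymbol{\theta}_2), \tag{19}$$

where

$$\ell_1(\boldsymbol{\theta}_1) = \ln f(\mathbf{y} \mid \mathbf{b}) = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{(\mathbf{y} - \mathbf{X}\boldsymbol{\beta} - \mathbf{Z}\mathbf{b})^{\top}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta} - \mathbf{Z}\mathbf{b})}{2\sigma^2}$$

and

$$\ell_2(\boldsymbol{\theta}_2) = \ln f(\mathbf{b}) = -\frac{T}{2}\ln(2\pi\sigma_\eta^2) + \frac{1}{2}\ln(1 - \rho^2) - \frac{(\mathbf{b} - \mathbf{1}_T\beta_0)^{\top}\mathbf{V}^{-1}(\mathbf{b} - \mathbf{1}_T\beta_0)}{2\sigma_\eta^2}. \tag{20}$$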



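The matrix $\mathbf{V} = \boldsymbol{\Gamma}/\sigma_\eta^2$ and it is straightforward to show that (Hamilton, 1994)

$$\mathbf{V}^{-1} = \begin{pmatrix} 1 & -\rho & 0 & \cdots & 0 \\ -\rho & 1 + \rho^2 & -\rho & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & -\rho & 1 + \rho^2 & -\rho \\ 0 & \cdots & 0 & -\rho & 1 \end{pmatrix}. \tag{21}$$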
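The tridiagonal form of the inverse can be verified numerically. The following sketch assumes V is the AR(1) covariance matrix divided by the innovation variance, that is, with entries rho^|i-j| / (1 - rho^2); the dimension T = 6 is arbitrary and the function names are illustrative.

```python
import numpy as np

def ar1_V(T, rho):
    """Scaled AR(1) covariance: entries rho^|i-j| / (1 - rho^2)."""
    idx = np.arange(T)
    return rho ** np.abs(idx[:, None] - idx[None, :]) / (1 - rho**2)

def ar1_V_inv(T, rho):
    """Tridiagonal inverse: 1 at the two corners, 1 + rho^2 elsewhere on the diagonal, -rho off it."""
    main = np.r_[1.0, np.full(T - 2, 1 + rho**2), 1.0]
    off = np.full(T - 1, -rho)
    return np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

T, rho = 6, 0.843
V, V_inv = ar1_V(T, rho), ar1_V_inv(T, rho)
product = V @ V_inv
det_V_inv = np.linalg.det(V_inv)
```

The determinant of the tridiagonal matrix equals 1 - rho^2, which is where the term (1/2) ln(1 - rho^2) in the level-2 log-likelihood comes from.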



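The estimation of $\boldsymbol{\theta}$ through the E-M algorithm consists of two steps, the Expectation (E) and
Maximization (M) steps, described in detail in the following.

E step. In the expectation step the expected score functions of the parameters, conditional on the
observed data, are computed on the basis of the current value of $\boldsymbol{\theta}$, denoted as $\boldsymbol{\theta}^{(h)}$, as follows:

$$E[S(\boldsymbol{\theta}; \mathbf{y}, \mathbf{b}) \mid \mathbf{y}, \boldsymbol{\theta}^{(h)}] = E[S_1(\boldsymbol{\theta}_1) \mid \mathbf{y}, \boldsymbol{\theta}^{(h)}] + E[S_2(\boldsymbol{\theta}_2) \mid \mathbf{y}, \boldsymbol{\theta}^{(h)}] \tag{22}$$

where $S(\boldsymbol{\theta}) = \partial\ell(\boldsymbol{\theta}; \mathbf{y}, \mathbf{b})/\partial\boldsymbol{\theta}$, $S_1(\boldsymbol{\theta}_1) = \partial\ell_1(\boldsymbol{\theta}_1)/\partial\boldsymbol{\theta}_1$ and $S_2(\boldsymbol{\theta}_2) = \partial\ell_2(\boldsymbol{\theta}_2)/\partial\boldsymbol{\theta}_2$.

The expressions of the expected score functions with respect to the level-1 parameters of the
model are given by

$$\begin{aligned} E(S_1(\boldsymbol{\beta}) \mid \mathbf{y}; \boldsymbol{\theta}^{(h)}) &= \frac{1}{\sigma^2}\left(-\mathbf{X}^{\top}\mathbf{X}\boldsymbol{\beta} + \mathbf{X}^{\top}\mathbf{y} - \mathbf{X}^{\top}\mathbf{Z}\hat{\mathbf{b}}\right) \\ E(S_1(\sigma^2) \mid \mathbf{y}; \boldsymbol{\theta}^{(h)}) &= -\frac{n}{2\sigma^2} + \frac{(\mathbf{y} - \mathbf{X}\boldsymbol{\beta} - \mathbf{Z}\hat{\mathbf{b}})^{\top}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta} - \mathbf{Z}\hat{\mathbf{b}}) + \mathrm{tr}(\mathbf{Z}^{\top}\mathbf{Z}\mathbf{B})}{2\sigma^4}. \end{aligned} \tag{23}$$

The expressions of the expected score functions with respect to the level-2 parameters of the
model are

$$\begin{aligned} E(S_2(\beta_0) \mid \mathbf{y}; \boldsymbol{\theta}^{(h)}) &= \frac{(1 - \rho)(\hat b_1 + \hat b_T) + (1 - \rho)^2\sum_{t=2}^{T-1}\hat b_t - (1 - \rho)\left(T - (T - 2)\rho\right)\beta_0}{\sigma_\eta^2} \\ E(S_2(\sigma_\eta^2) \mid \mathbf{y}; \boldsymbol{\theta}^{(h)}) &= -\frac{T}{2\sigma_\eta^2} + \frac{\mathrm{tr}(\mathbf{V}^{-1}\mathbf{B}) + \hat{\mathbf{u}}^{\top}\mathbf{V}^{-1}\hat{\mathbf{u}}}{2\sigma_\eta^4} \\ E(S_2(\rho) \mid \mathbf{y}; \boldsymbol{\theta}^{(h)}) &= \frac{\sum_{t=1}^{T-1}\left(B_{t,t+1} + \hat u_t\hat u_{t+1}\right) - \rho\sum_{t=2}^{T-1}\left(B_{t,t} + \hat u_t^2\right)}{\sigma_\eta^2} - \frac{\rho}{1 - \rho^2} \end{aligned}$$

where

$$\hat{\mathbf{u}} = E(\mathbf{u} \mid \mathbf{y}) = \boldsymbol{\Gamma}\mathbf{Z}^{\top}\boldsymbol{\Omega}^{-1}(\mathbf{y} - \mathbf{1}_n\beta_0 - \mathbf{X}\boldsymbol{\beta}), \qquad \hat{\mathbf{b}} = \beta_0 + \hat{\mathbf{u}} \tag{24}$$

$$\mathbf{B} = \mathrm{Var}(\mathbf{b} \mid \mathbf{y}) = \boldsymbol{\Gamma} - \boldsymbol{\Gamma}\mathbf{Z}^{\top}\boldsymbol{\Omega}^{-1}\mathbf{Z}\boldsymbol{\Gamma} \tag{25}$$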



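M step. It consists in maximizing the conditional expected value of the complete log-likelihood (19)
computed in the E-step, getting maximum likelihood estimates of the model parameters. In detail,
the current values of the vector of parameters $\boldsymbol{\theta}^{(h)}$ are updated as follows

$$\begin{aligned} \boldsymbol{\beta}^{(h+1)} &= (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}(\mathbf{y} - \mathbf{Z}\hat{\mathbf{b}}) \\ \sigma^{2\,(h+1)} &= \frac{(\mathbf{y} - \mathbf{X}\boldsymbol{\beta} - \mathbf{Z}\hat{\mathbf{b}})^{\top}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta} - \mathbf{Z}\hat{\mathbf{b}}) + \mathrm{tr}(\mathbf{Z}^{\top}\mathbf{Z}\mathbf{B})}{n} \\ \beta_0^{(h+1)} &= \frac{\hat b_1 + \hat b_T + (1 - \rho)\sum_{t=2}^{T-1}\hat b_t}{T - (T - 2)\rho} \\ \sigma_\eta^{2\,(h+1)} &= \frac{\mathrm{tr}(\mathbf{V}^{-1}\mathbf{B}) + \hat{\mathbf{u}}^{\top}\mathbf{V}^{-1}\hat{\mathbf{u}}}{T} \end{aligned}$$

Since we get a non-linear maximum likelihood equation for the parameter $\rho$, we update its
current value through an iteration of the Newton-Raphson scheme.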



The E-M algorithm consists of the following steps:

(a) Choose an initial value for the parameters $\theta$;

(b) Compute the expected score functions for all the parameters (E-step);

(c) Obtain improved parameter estimates (M-step);

(d) Repeat steps (b) and (c) until convergence, that is, until

$$\left|\ell(\theta^{(h+1)}; \mathbf{y}, \mathbf{b}) - \ell(\theta^{(h)}; \mathbf{y}, \mathbf{b})\right| \tag{26}$$

is arbitrarily small.
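The control flow of steps (a)-(d) can be sketched as a generic driver with the stopping rule on the change in log-likelihood. The driver and the toy problem below are illustrative assumptions, not the model of this paper: the toy estimates the mean of unit-variance normal data when two observations are missing, so that the E and M steps are one line each.

```python
def run_em(theta0, e_step, m_step, loglik, tol=1e-12, max_iter=1000):
    """Generic E-M loop: stop when the log-likelihood changes by less than tol."""
    theta = theta0                      # step (a): initial value
    ll_old = loglik(theta)
    for h in range(1, max_iter + 1):
        expectations = e_step(theta)    # step (b): E-step at the current value
        theta = m_step(expectations)    # step (c): M-step
        ll_new = loglik(theta)
        if abs(ll_new - ll_old) < tol:  # step (d): convergence check
            return theta, h
        ll_old = ll_new
    return theta, max_iter


# Toy problem (hypothetical data): three observed values, two missing ones.
observed = [1.0, 2.0, 3.0]
n_missing = 2


def toy_e_step(mu):
    # Expected complete-data sum: missing values are replaced by their expectation mu.
    return sum(observed) + n_missing * mu


def toy_m_step(expected_sum):
    # Complete-data maximum likelihood estimate of the mean.
    return expected_sum / (len(observed) + n_missing)


def toy_loglik(mu):
    # Observed-data log-likelihood up to an additive constant.
    return -0.5 * sum((y - mu) ** 2 for y in observed)


if __name__ == "__main__":
    mu_hat, n_iter = run_em(0.0, toy_e_step, toy_m_step, toy_loglik)
    print(mu_hat, n_iter)
```

In the toy problem the iterations approach the mean of the observed values, which is the maximum likelihood estimate; the same driver would accept the E and M steps of a richer model as callables.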



The EM algorithm produces the Empirical Bayes prediction for the random effects b, namely, the
mean of their conditional distribution with respect to the observed data y as in (24) (Searle et al.,
1992). The whole algorithm has been implemented in R with an original code. Further details on
the implementation and a Monte Carlo study based on the code can be found in Modugno (2012).
5. Application of the new model to Tribal art data

In this section, we present the results of fitting the new model to the Tribal art dataset.
Moreover, we compare the predictive capability of the three models under scrutiny. Consider
the model in equations (8), (9) and (10), with the same set of covariates as in the FE specification
reported in Table 1. We call it "ARE", standing for Autoregressive Random Effects. The results
are shown in the third column of Table 2. The estimates and the predicted random effects are
quite close to those from the RE model (second column). In this case, the estimated between-group
variance, which takes the form $\mathrm{Var}(u_t) = \sigma_\eta^2/(1 - \rho^2)$, is 0.036, slightly bigger than
that of the RE model ($\hat{\sigma}_u^2 = 0.029$). Consequently, the proportion of variability explained by the
between-semesters variance (ICC) is bigger for the new model, 17.3% against 14.5%. Also, the
level-2 residual variability of the ARE model, $\hat{\sigma}_\eta^2 = 0.010$, is smaller than that of the RE model,
$\hat{\sigma}_u^2 = 0.029$. This confirms that the structure at the second level has been taken into account by
the new specification. Furthermore, the estimate of the autoregressive parameter $\rho$ is quite high,
$\hat{\rho} = 0.843$, and agrees with the evidence of the correlograms of the residuals of the RE model (see
Figure 2). The last column reports $\beta_0 + u_t$ of the ARE model to facilitate the comparison with the
FE semester effects.
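The relation between the level-2 residual variance, the autoregressive parameter and the between-group variance can be checked with a short calculation. The sketch below plugs in the rounded estimates reported in Table 2 and assumes the ICC is computed as between-group variance over the sum of between-group and level-1 variance; because the inputs are rounded to three decimals, the output need not match the reported 0.036 and 17.3% exactly. Function names are hypothetical.

```python
def ar1_marginal_variance(sigma2_innov, rho):
    """Stationary variance of an AR(1) process with innovation variance sigma2_innov."""
    return sigma2_innov / (1.0 - rho ** 2)


def icc(var_between, var_within):
    """Share of total variability due to the between-group (between-semester) component."""
    return var_between / (var_between + var_within)


# Rounded estimates taken from Table 2.
sigma2_level1 = 0.173       # level-1 variance (RE and ARE)
sigma2_u_re = 0.029         # between-group variance, RE model
sigma2_innov_are = 0.010    # level-2 residual variance, ARE model
rho_are = 0.843             # autoregressive parameter, ARE model

var_u_are = ar1_marginal_variance(sigma2_innov_are, rho_are)
icc_are = icc(var_u_are, sigma2_level1)
icc_re = icc(sigma2_u_re, sigma2_level1)

if __name__ == "__main__":
    print(round(var_u_are, 4), round(icc_are, 3), round(icc_re, 3))
```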

Note that when $\rho$ is zero, the random effects are independent and the multilevel model reduces
to the RE specification. Hence, the ARE and the RE models are nested, so that we can use the
likelihood ratio test for assessing the significance of $\rho$. According to both the LR test and the
Information Criteria (see Table 2), the ARE model provides a better fit than the RE model.
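As an illustration of the likelihood ratio comparison, the statistic can be recovered from the AIC values and parameter counts of the RE and ARE models reported in Table 2. The sketch assumes the usual definition AIC = -2 log L + 2k, with k given by the "# param." row, and refers the statistic to a chi-square distribution with one degree of freedom; both are assumptions made here for the example, and the helper names are hypothetical.

```python
import math


def minus2_loglik_from_aic(aic, n_params):
    """Recover -2 log L from an AIC value, assuming AIC = -2 log L + 2k."""
    return aic - 2 * n_params


def chi2_sf_1df(x):
    """Upper tail probability of a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# AIC and number of parameters from Table 2.
m2ll_re = minus2_loglik_from_aic(15671, 73)
m2ll_are = minus2_loglik_from_aic(15647, 74)

lr_stat = m2ll_re - m2ll_are      # likelihood ratio statistic for rho = 0
p_value = chi2_sf_1df(lr_stat)

if __name__ == "__main__":
    print(lr_stat, p_value)
```

The tail probability uses the identity that a chi-square variable with one degree of freedom is the square of a standard normal, which avoids any dependency beyond the standard library.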

The diagnostic checks show that the ARE model presents the same features of non-normality
and non-homogeneity of variance among groups as the FE model (Section 3). Therefore, also in
this case, we have computed robust standard errors through the wild bootstrap procedure. The
presence of AR(1) random effects requires a further extension of the wild bootstrap for hierarchical
data (Modugno and Giannerini, 2013), which consists in replacing step (b) of subsection 3.1 with the
following:
(b) generate the bootstrap samples as 

for i = 1, . . . , n and for t = 1, . . . , T, where u^ is an autoregressive process with disturbances 
equal to Wt-qflt and 

$h_i$ is the $i$-th diagonal element of the orthogonal projection matrix of X;

such a modification takes into account both the time dependence at the second level and the
heteroscedasticity at the first level.
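The idea of an autoregressive process driven by wild-bootstrap disturbances can be illustrated as follows. This is a minimal sketch under several assumptions that are not taken from the text: the disturbances are formed as estimated level-2 innovations multiplied by independent Rademacher weights (+1 or -1 with equal probability), the recursion is started from zero, and the function and argument names are hypothetical.

```python
import random


def wild_ar1_effects(innov_resid, rho, rng):
    """Bootstrap level-2 effects: u*_t = rho * u*_{t-1} + w_t * innov_resid[t].

    innov_resid: estimated level-2 innovations (assumed available)
    rho:         estimated autoregressive parameter
    rng:         a random.Random instance supplying the Rademacher weights w_t
    """
    u_star = []
    previous = 0.0  # assumed starting value of the recursion
    for resid in innov_resid:
        weight = rng.choice((-1.0, 1.0))  # Rademacher weight (assumption)
        previous = rho * previous + weight * resid
        u_star.append(previous)
    return u_star


if __name__ == "__main__":
    rng = random.Random(2013)
    print(wild_ar1_effects([0.10, -0.20, 0.30, 0.05], 0.843, rng))
```

Multiplying each innovation by a random sign preserves its magnitude while randomizing its direction, so the bootstrap series keeps the autoregressive dependence without assuming identically distributed innovations.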

Table 2. Results of the fit for models FE, RE and ARE (equations (8), (9) and (10)).
Bootstrap standard errors in parentheses.

| | FE | RE | ARE | |
|---|---|---|---|---|
| AIC | 15576 | 15671 | 15647 | |
| BIC | 16317 | 16223 | 16207 | |
| # param. | 98 | 73 | 74 | |
| $\hat{\sigma}^2$ | 0.173 (0) | 0.173 (0.036) | 0.173 (0.043) | |
| $\hat{\sigma}_u^2$ | - | 0.029 (0.009) | - | |
| $\hat{\sigma}_\eta^2$ | - | - | 0.01 (0.013) | |
| $\hat{\rho}$ | - | - | 0.843 (0.128) | |
| ICC | - | 0.145 | 0.173 | |
| $\hat{\beta}_0$ | - | 2.216 (0.068) | 2.212 (0.112) | |
| Semester effect | $\beta_{0t}$ | $u_t$ | $u_t$ | $\beta_0 + u_t$ |
| 1998-1 | 1.96 (0.075) | -0.254 (0.022) | -0.257 (0.021) | 1.983 |
| 1998-2 | 2.081 (0.07) | -0.137 (0.016) | -0.143 (0.022) | 2.097 |
| 1999-1 | 2.15 (0.072) | -0.068 (0.019) | -0.069 (0.025) | 2.170 |
| 1999-2 | 2.355 (0.072) | 0.135 (0.017) | 0.13 (0.023) | 2.369 |
| 2000-1 | 2.454 (0.071) | 0.234 (0.016) | 0.229 (0.023) | 2.468 |
| 2000-2 | 2.418 (0.071) | 0.197 (0.016) | 0.195 (0.021) | 2.435 |
| 2001-1 | 2.393 (0.074) | 0.171 (0.02) | 0.165 (0.025) | 2.405 |
| 2001-2 | 2.244 (0.077) | 0.025 (0.025) | 0.038 (0.029) | 2.277 |
| 2002-1 | 2.352 (0.071) | 0.133 (0.017) | 0.12 (0.024) | 2.360 |
| 2002-2 | 2.15 (0.075) | -0.066 (0.024) | -0.068 (0.028) | 2.171 |
| 2003-1 | 2.031 (0.073) | -0.185 (0.017) | -0.191 (0.023) | 2.048 |
| 2003-2 | 1.932 (0.071) | -0.283 (0.016) | -0.29 (0.023) | 1.949 |
| 2004-1 | 1.911 (0.072) | -0.304 (0.019) | -0.31 (0.025) | 1.930 |
| 2004-2 | 2.029 (0.072) | -0.186 (0.016) | -0.193 (0.024) | 2.047 |
| 2005-1 | 2.204 (0.073) | -0.014 (0.018) | -0.025 (0.025) | 2.215 |
| 2005-2 | 2.175 (0.073) | -0.043 (0.017) | -0.048 (0.024) | 2.191 |
| 2006-1 | 2.192 (0.072) | -0.025 (0.019) | -0.035 (0.025) | 2.205 |
| 2006-2 | 2.09 (0.073) | -0.126 (0.016) | -0.13 (0.023) | 2.109 |
| 2007-1 | 2.151 (0.072) | -0.066 (0.017) | -0.073 (0.024) | 2.166 |
| 2007-2 | 2.195 (0.07) | -0.023 (0.019) | -0.03 (0.027) | 2.209 |
| 2008-1 | 2.196 (0.068) | -0.022 (0.022) | -0.031 (0.028) | 2.208 |
| 2008-2 | 2.119 (0.074) | -0.098 (0.017) | -0.101 (0.025) | 2.139 |
| 2009-1 | 2.225 (0.07) | 0.006 (0.019) | 0.003 (0.026) | 2.243 |
| 2009-2 | 2.438 (0.08) | 0.213 (0.032) | 0.201 (0.037) | 2.440 |
| 2010-1 | 2.449 (0.079) | 0.226 (0.03) | 0.227 (0.035) | 2.466 |
| 2010-2 | 2.523 (0.097) | 0.282 (0.056) | 0.285 (0.055) | 2.525 |
| 2011-1 | 2.506 (0.075) | 0.28 (0.033) | 0.279 (0.03) | 2.519 |
| Type of object: baseline Furniture | | | | |
| Sticks | -0.093 (0.026) | -0.093 (0.035) | -0.094 (0.028) | |
| Masks | 0.109 (0.021) | 0.109 (0.023) | 0.108 (0.024) | |
| Religious objects | 0.001 (0.023) | 0.001 (0.028) | -0.001 (0.027) | |
| Ornaments | -0.097 (0.026) | -0.097 (0.036) | -0.099 (0.029) | |
| Sculptures | 0.049 (0.02) | 0.049 (0.024) | 0.047 (0.023) | |
| Musical instruments | -0.117 (0.033) | -0.116 (0.045) | -0.118 (0.038) | |
| Tools | -0.084 (0.021) | -0.084 (0.024) | -0.085 (0.024) | |
| Clothing | -0.068 (0.039) | -0.068 (0.055) | -0.069 (0.042) | |
| Textiles | -0.04 (0.038) | -0.04 (0.053) | -0.041 (0.041) | |
| Weapons | -0.097 (0.027) | -0.097 (0.034) | -0.098 (0.029) | |
| Jewels | -0.045 (0.034) | -0.045 (0.046) | -0.047 (0.038) | |
| Yes vs No | | | | |
| Specialized bibliography (dummy) | 0.14 (0.012) | 0.14 (0.02) | 0.14 (0.013) | |
| Comparative bibliography (dummy) | 0.118 (0.009) | 0.119 (0.021) | 0.119 (0.01) | |
| Exhibition (dummy) | 0.08 (0.014) | 0.08 (0.028) | 0.08 (0.015) | |
| Historicization: baseline Absent | | | | |
| Museum certification | 0.009 (0.015) | 0.009 (0.04) | 0.01 (0.017) | |
| Relevant museum certification | 0.015 (0.016) | 0.015 (0.042) | 0.016 (0.016) | |
| Simple certification | 0.032 (0.01) | 0.032 (0.031) | 0.032 (0.01) | |
| Region: baseline Central America | | | | |
| Southern Africa | -0.164 (0.033) | -0.165 (0.04) | -0.165 (0.036) | |
| Western Africa | -0.105 (0.012) | -0.106 (0.017) | -0.106 (0.012) | |
| Eastern Africa | -0.161 (0.029) | -0.162 (0.035) | -0.162 (0.032) | |
| Australia | 0.038 (0.053) | 0.038 (0.088) | 0.037 (0.055) | |
| Indonesia | -0.111 (0.027) | -0.112 (0.046) | -0.112 (0.029) | |
| Melanesia | 0.007 (0.016) | 0.007 (0.031) | 0.006 (0.016) | |
| Polynesia | 0.185 (0.018) | 0.184 (0.032) | 0.184 (0.02) | |
| Northern America | 0.232 (0.018) | 0.232 (0.056) | 0.232 (0.02) | |
| Northern Africa | -0.374 (0.123) | -0.375 (0.18) | -0.375 (0.13) | |
| Southern America | 0.013 (0.023) | 0.012 (0.051) | 0.013 (0.025) | |
| Mesoamerica | 0.114 (0.021) | 0.113 (0.051) | 0.114 (0.021) | |
| Far Eastern | -0.06 (0.139) | -0.061 (0.315) | -0.06 (0.15) | |
| Micronesia | 0.097 (0.076) | 0.097 (0.078) | 0.097 (0.08) | |
| Indian Region | 0.303 (0.096) | 0.299 (0.097) | 0.296 (0.092) | |
| Asian Southeast | -0.064 (0.118) | -0.066 (0.153) | -0.067 (0.125) | |
| Middle East | -0.514 (0.085) | -0.513 (0.164) | -0.513 (0.088) | |
| Type of material: baseline Ivory | | | | |
| Vegetable fibre, paper, plumage | -0.046 (0.028) | -0.046 (0.041) | -0.047 (0.032) | |
| Wood | 0.078 (0.021) | 0.078 (0.029) | 0.077 (0.024) | |
| Metal | -0.033 (0.028) | -0.034 (0.046) | -0.035 (0.033) | |
| Gold | 0.13 (0.032) | 0.13 (0.059) | 0.129 (0.037) | |
| Stone | 0.046 (0.03) | 0.046 (0.036) | 0.045 (0.034) | |
| Precious stone | 0.052 (0.033) | 0.052 (0.045) | 0.052 (0.037) | |
| Terracotta, ceramic | 0.007 (0.027) | 0.007 (0.044) | 0.006 (0.031) | |
| Silver | -0.079 (0.048) | -0.08 (0.078) | -0.08 (0.047) | |
| Textile and hides | -0.019 (0.033) | -0.019 (0.058) | -0.021 (0.04) | |
| Seashell | 0.058 (0.054) | 0.058 (0.11) | 0.057 (0.059) | |
| Bone, horn | -0.13 (0.036) | -0.131 (0.07) | -0.131 (0.039) | |
| Not indicated | 0.044 (0.045) | 0.044 (0.055) | 0.041 (0.052) | |
| Patina: baseline Not indicated | | | | |
| Pejorative | 0.235 (0.039) | 0.234 (0.044) | 0.234 (0.04) | |
| Present | 0.029 (0.011) | 0.028 (0.025) | 0.028 (0.013) | |
| Appreciative | 0.11 (0.012) | 0.109 (0.029) | 0.109 (0.013) | |
| Description on the catalogue: baseline Absent | | | | |
| Short visual descr. | -0.13 (0.037) | -0.132 (0.101) | -0.134 (0.038) | |
| Visual descr. | 0.039 (0.038) | 0.038 (0.103) | 0.036 (0.039) | |
| Broad visual descr. | 0.279 (0.041) | 0.278 (0.114) | 0.276 (0.043) | |
| Critical descr. | 0.269 (0.041) | 0.268 (0.119) | 0.266 (0.042) | |
| Broad critical descr. | 0.634 (0.046) | 0.634 (0.131) | 0.632 (0.048) | |
| Illustration: baseline Absent | | | | |
| Miscellaneous col. | 0.411 (0.02) | 0.412 (0.048) | 0.41 (0.021) | |
| Col. cover | 1.412 (0.11) | 1.411 (0.202) | 1.41 (0.113) | |
| Col. half page | 0.854 (0.023) | 0.856 (0.072) | 0.854 (0.024) | |
| Col. full page | 1.008 (0.025) | 1.008 (0.075) | 1.007 (0.024) | |
| More than one col. | 1.223 (0.028) | 1.223 (0.078) | 1.221 (0.029) | |
| Col. quarter page | 0.674 (0.021) | 0.675 (0.062) | 0.673 (0.021) | |
| Miscellaneous b/w | 0.41 (0.033) | 0.409 (0.055) | 0.406 (0.035) | |
| b/w half page | 0.551 (0.045) | 0.552 (0.084) | 0.549 (0.051) | |
| b/w quarter page | 0.304 (0.025) | 0.305 (0.075) | 0.303 (0.027) | |
| Auction house and venue: baseline Bonhams-New York | | | | |
| Christie's-Amsterdam | 0.765 (0.054) | 0.766 (0.073) | 0.756 (0.059) | |
| Christie's-New York | 0.702 (0.054) | 0.7 (0.055) | 0.69 (0.059) | |
| Christie's-Paris | 0.601 (0.05) | 0.6 (0.049) | 0.592 (0.054) | |
| Encheres Rive Gauche-Paris | 0.536 (0.086) | 0.534 (0.044) | 0.523 (0.09) | |
| Koller-Zurich | -0.012 (0.052) | -0.014 (0.075) | -0.021 (0.059) | |
| Piasa-Paris | 0.753 (0.071) | 0.751 (0.052) | 0.74 (0.071) | |
| Sotheby's-New York | 0.866 (0.049) | 0.866 (0.046) | 0.856 (0.055) | |
| Sotheby's-Paris | 0.761 (0.05) | 0.761 (0.049) | 0.752 (0.055) | |

Fig. 3. Plots of autocorrelation functions of the level-2 residuals of the ARE model.

Table 3. Prediction/forecasting performance of the three models over 100 out-of-sample
units within the time span 1998-2011 (rows 1-2) and over 281 units of the out-of-sample
semester, 2011-1 (rows 3-4).

| | FE | RE | ARE |
|---|---|---|---|
| 100 units within the time span 1998-2011 | | | |
| MAE | 0.280 | 0.280 | 0.280 |
| RMSE | 0.342 | 0.342 | 0.342 |
| 281 units in the semester 2011-1 | | | |
| MAE | 0.494 | 0.489 | 0.454 |
| RMSE | 0.423 | 0.419 | 0.358 |

The autocorrelation functions (global and partial) of level-2 residuals (see Figure 3) do not
reveal any structure, as the values lie within the 95% confidence bands at all lags. Hence, our
novel specification has successfully captured the time dependence of the price dynamics by means of
the autoregressive specification at the second level. Finally, Table 3 summarizes and compares the
prediction capability of the three models under study. The aggregate measures of prediction error
are the Mean Absolute (Prediction) Error $\mathrm{MAE} = \frac{1}{n_{T+1}} \sum_{i=1}^{n_{T+1}} |y_{i,T+1} - \hat{y}_{i,T+1}|$ and the Root Mean
Square (Prediction) Error $\mathrm{RMSE} = \sqrt{\frac{1}{n_{T+1}} \sum_{i=1}^{n_{T+1}} (y_{i,T+1} - \hat{y}_{i,T+1})^2}$. The first two rows of Table 3
report the prediction error over 100 (out-of-sample) items within the time span 1998-2011. In this
instance, the three models present the same performance. The last two rows of the table show
the forecast performance over all the 281 observations of the semester 2011-1. Such observations
have not been included in the model, so that the measures reflect a genuine one-step-ahead forecast
performance. Clearly, the ARE model yields better forecasts of the prices of artwork
objects through the autoregressive specification.
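The two aggregate measures of prediction error can be computed as in the following sketch. The function names and the small input vectors are hypothetical and serve only to show the arithmetic; they are not taken from the Tribal art data.

```python
import math


def mae(actual, predicted):
    """Mean absolute prediction error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square prediction error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


if __name__ == "__main__":
    # Hypothetical observed and predicted values on the response scale.
    y_obs = [2.10, 2.45, 1.95, 2.60]
    y_hat = [2.00, 2.50, 2.20, 2.30]
    print(mae(y_obs, y_hat), rmse(y_obs, y_hat))
```

For any set of errors the MAE cannot exceed the RMSE, since the root mean square of a set of numbers is at least as large as the mean of their absolute values.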

In conclusion, compared with the other two models, our new ARE model presents a better fit
and superior forecasting performance. Although the estimates are similar to those of the hedonic
regression model, the multilevel framework is more parsimonious and provides a natural, flexible
approach through the decomposition of the total variability of the response. The autoregressive
specification is backed up by Art Economics theory, which confirms that the process of formation of
auction prices has short memory: indeed, in the case of Tribal Art, the dependence is upon the
previous semester.
6. Conclusions

In the present work, we have introduced a multilevel framework for the analysis of prices of artworks
sold at auctions over time. The proposal combines the flexibility of mixed effect models, in that
it accounts for various sources of heterogeneity, with the predictive performance
of time series models. The latter component makes it possible to specify a substantive model for the price
dynamics over time. Since auction data do not constitute a proper panel or a time series, we need
a multilevel specification with items at the first level and time points at the second level.

We have applied this specification to analyse the Tribal art market by using the first database
on Ethnic artworks, which contains information on more than 20000 items sold in the most important
auction houses in the world. The results show that our approach gives a substantial advantage
over the traditional hedonic regression model, especially in terms of degrees of freedom, parsimony
and interpretability. In fact, the multilevel model retains the ease of interpretation of the hedonic
regression model, since the estimated regression coefficients can still be seen as shadow prices for
each feature, and a price index for the art market is easily provided through the predictions of the
time effects. On the other hand, it has fewer parameters to be estimated and provides a decomposition
of the total variability of the response.

The dependence of the price over time has been modelled by means of an autoregressive
specification at the second level. Hence, we have extended the classic multilevel model by relaxing the
assumption of independence among random effects and treating them as a time series at the second
level. To this end, we have derived full maximum likelihood estimators through the
E-M algorithm and have implemented them in an original R code. The results show that the new
specification fully captures the temporal dependence structure among group effects. Moreover, the
model presents superior forecasting performance with respect to other proposals. In conclusion, we
advocate the use of our specification as a natural choice for modelling artwork prices and, possibly,
for obtaining forecasts/predictions that might be valuable to auction houses, banks and investors.

The work presented here can be extended in different directions; also, many applications are
possible. First, it could be interesting to explore further the nature of the deviation from normality
of level-1 residuals. This might be accomplished by inserting further variance components in the
model, especially those related to the interactions between covariates. Also, possible volatility
effects (ARCH/GARCH) can be inserted so as to extend considerably the flexibility of the model and
make it appealing from the point of view of financial applications. Moreover, the model could
be applied to characterize and forecast other art markets. Lastly, in order to promote the usage
of our model and to facilitate the reproducibility of the research, we plan to release the software
implemented as an R package. The latter project would help fill the gap that has hindered
the practical use of multilevel models for repeated cross-sectional data.